Finishing Repetitive Regions Automatically with Dupfinisher
Abstract
The genome sequencing community is producing shotgun sequence data at a very high rate, but genome finishing is not keeping pace, even with help from automated finishing tools such as autoFinish. One reason for the slow progress is that repetitive regions longer than a sequencing read cannot be assembled correctly by many current assembly tools, so most repeat regions have to be checked manually. If finishing rates are to increase further, most repetitive regions must be assembled correctly and finished in an automated fashion. The Dupfinisher computer program is designed to finish repeats with minimal human interaction. It automatically detects repetitive regions, assembles each repeat individually using paired draft reads and primer walk reads, checks the quality of these subassemblies, creates artificial joins for finished and properly assembled repeats, and runs automated gap closure scripts on unfinished subassemblies. Dupfinisher can resolve the majority of repeats in a microbial genome automatically, greatly reducing the amount of human attention needed. It has been used in finishing more than 60 genomes and can be adapted to aid finishing of whole bacterial genome and large-insert clone projects.
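To make the first step concrete, the following is a minimal sketch of how repetitive regions in a consensus sequence might be flagged and compared against the read length. It is not Dupfinisher's actual detection method, which the abstract does not describe. The k-mer approach, the function names `find_repeat_regions` and `exceeds_read_length`, and the parameter values are all assumptions made for illustration.

```python
from collections import defaultdict


def find_repeat_regions(seq, k=6):
    """Return merged (start, end) intervals covered by k-mers that occur more than once.

    Illustrative only: exact k-mer matching on one strand, no reverse complements,
    no tolerance for sequencing errors.
    """
    # Index every k-mer by the positions where it starts.
    positions = defaultdict(list)
    for i in range(len(seq) - k + 1):
        positions[seq[i:i + k]].append(i)

    # Collect start positions of all k-mers seen at least twice.
    starts = sorted(i for occ in positions.values() if len(occ) > 1 for i in occ)

    # Merge overlapping or adjacent k-mer hits into half-open intervals.
    regions = []
    for s in starts:
        e = s + k
        if regions and s <= regions[-1][1]:
            regions[-1] = (regions[-1][0], max(regions[-1][1], e))
        else:
            regions.append((s, e))
    return regions


def exceeds_read_length(region, read_length):
    """True if the repeat copy is longer than a single read (hypothetical helper)."""
    start, end = region
    return end - start > read_length


if __name__ == "__main__":
    repeat = "GATTACAGATTC"   # toy 12-base repeat, assumed for the example
    spacer = "TGCCGTAA"       # toy unique sequence between the two copies
    consensus = repeat + spacer + repeat
    for region in find_repeat_regions(consensus, k=6):
        print(region, exceeds_read_length(region, read_length=10))
```

Repeat copies that are longer than a read are the ones a standard assembler is likely to collapse or misjoin, so under this sketch they would be the candidates for separate subassembly with paired reads.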